Introduction to Computer Vision: Plant Seedlings Classification¶

Problem Statement¶

Context¶

In recent times, the field of agriculture has been in urgent need of modernization, since the amount of manual work required to check whether plants are growing correctly is still highly extensive. Despite several advances in agricultural technology, people working in the industry still need to sort and recognize different plants and weeds by hand, which takes a great deal of time and effort in the long term. This trillion-dollar industry is ripe for technological innovations that cut down on the need for manual labor, and this is where Artificial Intelligence can genuinely benefit the workers in this field: the time and energy required to identify plant seedlings can be greatly shortened through AI and Deep Learning. The ability to do so far more efficiently, and even more effectively, than experienced manual labor could lead to better crop yields, free up human involvement for higher-order agricultural decision-making, and, in the long term, result in more sustainable environmental practices in agriculture.

Objective¶

The aim of this project is to build a Convolutional Neural Network to classify plant seedlings into their respective categories.

Data Dictionary¶

The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has recently released a dataset containing images of unique plants belonging to 12 different species.

  • The dataset can be downloaded from Olympus.
  • The data file names are:
    • images.npy
    • Labels.csv
  • Due to the large volume of data, the images were converted to the images.npy file and the labels were put into Labels.csv, so that you can work on the data/project seamlessly without having to worry about the high data volume.

  • The goal of the project is to create a classifier capable of determining a plant's species from an image.

List of Species

  • Black-grass
  • Charlock
  • Cleavers
  • Common Chickweed
  • Common Wheat
  • Fat Hen
  • Loose Silky-bent
  • Maize
  • Scentless Mayweed
  • Shepherds Purse
  • Small-flowered Cranesbill
  • Sugar beet

Note: Please use GPU runtime on Google Colab to execute the code faster.¶

Importing necessary libraries¶

In [ ]:
# Installing the libraries with the specified version.
# uncomment and run the following line if Google Colab is being used
# !pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.3 opencv-python==4.8.0.76 -q --user
In [ ]:
# Installing the libraries with the specified version.
# uncomment and run the following lines if Jupyter Notebook is being used
#!pip install tensorflow==2.13.0 scikit-learn==1.2.2 seaborn==0.11.1 matplotlib==3.3.4 numpy==1.24.3 pandas==1.5.2 opencv-python==4.8.0.76 -q --user

Note: After running the above cell, kindly restart the notebook kernel and run all cells sequentially from the start again.

In [32]:
import os
import numpy as np                                                                               # Importing numpy for Matrix Operations
import pandas as pd                                                                              # Importing pandas to read CSV files
import matplotlib.pyplot as plt                                                                  # Importing matplotlib for plotting and visualizing images
import math                                                                                      # Importing math module to perform mathematical operations
import cv2                                                                                       # Importing openCV for image processing
import seaborn as sns                                                                            # Importing seaborn to plot graphs


# Tensorflow modules
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator                              # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential                                                   # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN Model
from tensorflow.keras.optimizers import Adam,SGD                                                 # Importing the optimizers which can be used in our model
from sklearn import preprocessing                                                                # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split                                             # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix                                                     # Importing confusion_matrix to plot the confusion matrix

# Display images using OpenCV
from google.colab.patches import cv2_imshow                                                      # Importing cv2_imshow from google.patches to display images

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
In [33]:
from tensorflow.keras import backend
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import classification_report

Loading the dataset¶

In [34]:
# Uncomment and run the below code if you are using google colab
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
In [35]:
# Load the image file of the dataset
images = np.load('/content/drive/MyDrive/Colab Notebooks/computer-vision-project/images.npy')

# Load the labels file of the dataset
labels = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/computer-vision-project/Labels.csv')

Data Overview¶

Understand the shape of the dataset¶

In [36]:
print(images.shape)
print(labels.shape)
(4750, 128, 128, 3)
(4750, 1)
In [37]:
categ = ['Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed', 'Common wheat', 'Fat Hen', 'Loose Silky-bent',
              'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet']
num_categ = len(categ)
num_categ
Out[37]:
12

Observation:¶

  • The total number of plant categories is 12, i.e., the model's output should have 12 predictions
  • We have a total of 4750 plant images
  • Each image is of shape 128 x 128
  • As the number of channels is 3, the images are in RGB (Red, Green, Blue)
In [38]:
#Importing ImageGrid to plot the plant sample images
from mpl_toolkits.axes_grid1 import ImageGrid
#defining a figure of size 12X12
fig = plt.figure(1, figsize=(num_categ, num_categ))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categ, num_categ), axes_pad=0.05)
i = 0
index = labels.index

#Plotting 12 images from each plant category
for category_id, category in enumerate(categ):
  condition = labels["Label"] == category
  plant_indices = index[condition].tolist()
  for j in range(0,12):
      ax = grid[i]
      ax.imshow(images[plant_indices[j]])
      ax.axis('off')
      if i % num_categ == num_categ - 1:
        #printing the name for each category
        ax.text(200, 70, category, verticalalignment='center')
      i += 1
plt.show();

Plotting images using OpenCV and matplotlib¶

In [39]:
cv2_imshow(images[5])
In [40]:
plt.imshow(images[5])
Out[40]:
<matplotlib.image.AxesImage at 0x7807b3fbfc40>
  • We can observe that the images appear in different colors when plotted with OpenCV versus matplotlib. OpenCV reads images in BGR format, which shows that the given NumPy arrays were generated from the original images using OpenCV.
  • Now we will convert these BGR images to RGB images so we can interpret them easily.
In [41]:
# Converting the images from BGR to RGB using cvtColor function of OpenCV
for i in range(len(images)):
  images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
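The per-image loop above does the job; since the data is a NumPy array, the same BGR-to-RGB reordering can also be sketched as a single vectorized slice. A minimal illustration on a toy array (not the project data) is:

```python
import numpy as np

# Toy "batch" of 2 images, 2x2 pixels, 3 channels, standing in for images.npy.
bgr = np.arange(2 * 2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 2, 3)

# Reversing the channel axis swaps B and R in one step, which matches what
# cv2.COLOR_BGR2RGB does for a plain 3-channel reordering.
rgb = bgr[..., ::-1]

print(rgb.shape)      # same shape as the input: (2, 2, 2, 3)
print(rgb[0, 0, 0])   # first pixel's channels, reversed: [2 1 0]
```

This avoids the Python-level loop entirely, at the cost of being less explicit than the `cv2.cvtColor` call.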

Exploratory Data Analysis¶

  • EDA is an important part of any project involving data.
  • It is important to investigate and understand the data better before building a model with it.
  • A few questions have been mentioned below which will help you understand the data better.
  • A thorough analysis of the data, in addition to the questions mentioned below, should be done.
  1. How are these different category plant images different from each other?
  2. Is the dataset imbalanced? (Check using bar plots)

Plotting random images from each of the class¶

In [ ]:
def plot_images(images, labels):
  keys = dict(labels['Label'])                                                  # Mapping from row index to label name
  rows = 3                                                                      # Defining number of rows = 3
  cols = 4                                                                      # Defining number of columns = 4
  fig = plt.figure(figsize=(10, 8))                                             # Defining the figure size as 10x8
  for i in range(cols):
      for j in range(rows):
          random_index = np.random.randint(0, len(labels))                      # Picking a random index from the data
          ax = fig.add_subplot(rows, cols, i * rows + j + 1)                    # Adding subplots with 3 rows and 4 columns
          ax.imshow(images[random_index, :])                                    # Plotting the image
          ax.set_title(keys[random_index])                                      # Using its label as the title
  plt.show()
In [ ]:
plot_images(images,labels)

Checking for data imbalance¶

In [ ]:
sns.countplot(x=labels['Label'])
plt.xticks(rotation='vertical')
Out[ ]:
(array([  0., 100., 200., 300., 400., 500., 600., 700.]),
 [Text(0.0, 0, '0'),
  Text(100.0, 0, '100'),
  Text(200.0, 0, '200'),
  Text(300.0, 0, '300'),
  Text(400.0, 0, '400'),
  Text(500.0, 0, '500'),
  Text(600.0, 0, '600'),
  Text(700.0, 0, '700')])
In [ ]:
plt.rcParams["figure.figsize"] = (12,5)
sns.countplot(x=labels.iloc[:,-1],order = labels['Label'].value_counts().index, palette='Greens_r')
plt.xlabel('Plant Categories')
plt.xticks(rotation=45)
Out[ ]:
([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
 [Text(0, 0, 'Loose Silky-bent'),
  Text(1, 0, 'Common Chickweed'),
  Text(2, 0, 'Scentless Mayweed'),
  Text(3, 0, 'Small-flowered Cranesbill'),
  Text(4, 0, 'Fat Hen'),
  Text(5, 0, 'Charlock'),
  Text(6, 0, 'Sugar beet'),
  Text(7, 0, 'Cleavers'),
  Text(8, 0, 'Black-grass'),
  Text(9, 0, 'Shepherds Purse'),
  Text(10, 0, 'Common wheat'),
  Text(11, 0, 'Maize')])

Observation:¶

  • Loose Silky-bent has more plant samples than any other category
  • "Common wheat" and "Maize" have the fewest plant samples
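The bar plot makes the imbalance visible; to put a number on it, `value_counts()` on the label column gives the per-class counts directly. A small sketch with made-up labels (not the real Labels.csv) is:

```python
import pandas as pd

# Made-up labels mimicking the single "Label" column of Labels.csv.
labels = pd.DataFrame({"Label": ["Loose Silky-bent"] * 6
                                + ["Fat Hen"] * 4
                                + ["Maize"] * 2})

counts = labels["Label"].value_counts()
print(counts)

# Ratio of the largest to the smallest class is a quick imbalance measure.
print(counts.max() / counts.min())   # 3.0
```

Running the same two lines on the actual `labels` frame quantifies how much Loose Silky-bent dominates the rarest classes.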

Data Pre-Processing¶

Convert the BGR images to RGB images.¶

Resize the images¶

As the images are large, it may be computationally expensive to train on them; therefore, it is preferable to reduce the image size from 128 x 128 to 64 x 64.

In [42]:
images_decreased=[]
height = 64
width = 64
dimensions = (width, height)
for i in range(len(images)):
  images_decreased.append( cv2.resize(images[i], dimensions, interpolation=cv2.INTER_LINEAR))
In [43]:
plt.imshow(images_decreased[3])
Out[43]:
<matplotlib.image.AxesImage at 0x7807b3bdd8a0>

Visualizing images using Gaussian Blur¶

In [44]:
# Applying Gaussian blur to denoise the images
images_gb = []
for i in range(len(images_decreased)):
  images_gb.append(cv2.GaussianBlur(images_decreased[i], ksize=(3, 3), sigmaX=0))
In [45]:
plt.imshow(images_gb[3])
Out[45]:
<matplotlib.image.AxesImage at 0x7807bddd5630>
  • Gaussian blurring appears to be ineffective here: the blurred (denoised) images no longer seem to contain the relevant detail, and the model would struggle to categorize them.

Data Preparation for Modeling¶

  • Before you proceed to build a model, you need to split the data into train, validation, and test sets to be able to evaluate the model that you build on the train data
  • You'll have to encode the categorical labels and scale the pixel values.
  • You will build a model using the train data and then check its performance

Split the dataset

In [46]:
from sklearn.model_selection import train_test_split
X_temp, X_test, y_temp, y_test = train_test_split(np.array(images_decreased),labels , test_size=0.1, random_state=42,stratify=labels)
X_train, X_val, y_train, y_val = train_test_split(X_temp,y_temp , test_size=0.1, random_state=42,stratify=y_temp)
In [47]:
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
(3847, 64, 64, 3) (3847, 1)
(428, 64, 64, 3) (428, 1)
(475, 64, 64, 3) (475, 1)

Encode the target labels¶

In [48]:
# Convert labels from names to one-hot vectors.
# We have already used encoding methods like OneHotEncoder and LabelEncoder earlier, so now we will use a new encoding method called LabelBinarizer.
# LabelBinarizer works similarly to OneHotEncoder

from sklearn.preprocessing import LabelBinarizer
enc = LabelBinarizer()
y_train_encoded = enc.fit_transform(y_train)
y_val_encoded=enc.transform(y_val)
y_test_encoded=enc.transform(y_test)
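As a quick sanity check of how `LabelBinarizer` behaves (a toy example with three made-up class names, not the project labels): `fit_transform` produces one column per class in sorted order, and `inverse_transform` maps one-hot rows back to names, which is also handy later for decoding predictions.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Toy string labels standing in for the 12 species names.
y = np.array(["Maize", "Charlock", "Cleavers", "Maize"])

enc = LabelBinarizer()
y_enc = enc.fit_transform(y)

print(enc.classes_)                  # columns in sorted order: Charlock, Cleavers, Maize
print(y_enc.shape)                   # (4, 3): one one-hot row per sample
print(enc.inverse_transform(y_enc))  # round-trips back to the original names
```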

Data Normalization¶

In [49]:
# Normalizing the image pixels
X_train_normalized = X_train.astype('float32')/255.0
X_val_normalized = X_val.astype('float32')/255.0
X_test_normalized = X_test.astype('float32')/255.0

Model Building¶

In [50]:
# Clearing backend
from tensorflow.keras import backend
backend.clear_session()
In [51]:
# Fixing the seed for random number generators
import random
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
In [52]:
# Initializing a sequential model
model = Sequential()

# Adding the first conv layer with 128 filters and kernel size 3x3; padding 'same' keeps the output size the same as the input size
# input_shape denotes the dimensions of the input images
model.add(Conv2D(128, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))

# Adding max pooling to reduce the size of output of first conv layer
model.add(MaxPooling2D((2, 2), padding = 'same'))

model.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2), padding = 'same'))

model.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2), padding = 'same'))

# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model.add(Flatten())

# Adding a fully connected dense layer with 16 neurons
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.3))
# Adding the output layer with 12 neurons and activation functions as softmax since this is a multi-class classification problem
model.add(Dense(12, activation='softmax'))


# Using ADAM Optimizer
opt=Adam()
# Compile model
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Generating the summary of the model
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 128)       3584      
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 128)       0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 64)        73792     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 16, 16, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 32)        18464     
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 8, 8, 32)          0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 16)                32784     
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_1 (Dense)             (None, 12)                204       
                                                                 
=================================================================
Total params: 128828 (503.23 KB)
Trainable params: 128828 (503.23 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Fitting the model on the train data¶

In [53]:
history_1 = model.fit(
            X_train_normalized, y_train_encoded,
            epochs=30,
            validation_data=(X_val_normalized,y_val_encoded),
            batch_size=32,
            verbose=2
)
Epoch 1/30
121/121 - 4s - loss: 2.4544 - accuracy: 0.1050 - val_loss: 2.4393 - val_accuracy: 0.1379 - 4s/epoch - 34ms/step
Epoch 2/30
121/121 - 1s - loss: 2.4354 - accuracy: 0.1347 - val_loss: 2.4301 - val_accuracy: 0.1402 - 1s/epoch - 9ms/step
Epoch 3/30
121/121 - 1s - loss: 2.3309 - accuracy: 0.2155 - val_loss: 2.1966 - val_accuracy: 0.2804 - 1s/epoch - 9ms/step
Epoch 4/30
121/121 - 1s - loss: 2.0663 - accuracy: 0.2927 - val_loss: 1.8985 - val_accuracy: 0.3598 - 1s/epoch - 9ms/step
Epoch 5/30
121/121 - 1s - loss: 1.9226 - accuracy: 0.3049 - val_loss: 1.7517 - val_accuracy: 0.3855 - 1s/epoch - 9ms/step
Epoch 6/30
121/121 - 1s - loss: 1.8281 - accuracy: 0.3384 - val_loss: 1.5521 - val_accuracy: 0.4229 - 1s/epoch - 10ms/step
Epoch 7/30
121/121 - 1s - loss: 1.7316 - accuracy: 0.3735 - val_loss: 1.5349 - val_accuracy: 0.4136 - 1s/epoch - 11ms/step
Epoch 8/30
121/121 - 1s - loss: 1.6610 - accuracy: 0.3923 - val_loss: 1.4390 - val_accuracy: 0.4930 - 1s/epoch - 10ms/step
Epoch 9/30
121/121 - 1s - loss: 1.5902 - accuracy: 0.4216 - val_loss: 1.3399 - val_accuracy: 0.4953 - 1s/epoch - 9ms/step
Epoch 10/30
121/121 - 1s - loss: 1.5618 - accuracy: 0.4190 - val_loss: 1.4479 - val_accuracy: 0.5280 - 1s/epoch - 9ms/step
Epoch 11/30
121/121 - 1s - loss: 1.5198 - accuracy: 0.4442 - val_loss: 1.2908 - val_accuracy: 0.5187 - 1s/epoch - 9ms/step
Epoch 12/30
121/121 - 1s - loss: 1.4840 - accuracy: 0.4494 - val_loss: 1.2796 - val_accuracy: 0.5584 - 1s/epoch - 9ms/step
Epoch 13/30
121/121 - 1s - loss: 1.4389 - accuracy: 0.4650 - val_loss: 1.2595 - val_accuracy: 0.6098 - 1s/epoch - 9ms/step
Epoch 14/30
121/121 - 1s - loss: 1.4016 - accuracy: 0.4853 - val_loss: 1.2006 - val_accuracy: 0.6028 - 1s/epoch - 9ms/step
Epoch 15/30
121/121 - 1s - loss: 1.4046 - accuracy: 0.4793 - val_loss: 1.2641 - val_accuracy: 0.5841 - 1s/epoch - 9ms/step
Epoch 16/30
121/121 - 1s - loss: 1.3488 - accuracy: 0.4942 - val_loss: 1.1949 - val_accuracy: 0.6215 - 1s/epoch - 9ms/step
Epoch 17/30
121/121 - 1s - loss: 1.3364 - accuracy: 0.5032 - val_loss: 1.1528 - val_accuracy: 0.6472 - 1s/epoch - 10ms/step
Epoch 18/30
121/121 - 1s - loss: 1.3002 - accuracy: 0.5243 - val_loss: 1.1301 - val_accuracy: 0.6379 - 1s/epoch - 10ms/step
Epoch 19/30
121/121 - 1s - loss: 1.2735 - accuracy: 0.5368 - val_loss: 1.1189 - val_accuracy: 0.6636 - 1s/epoch - 11ms/step
Epoch 20/30
121/121 - 1s - loss: 1.2322 - accuracy: 0.5547 - val_loss: 1.0387 - val_accuracy: 0.6729 - 1s/epoch - 9ms/step
Epoch 21/30
121/121 - 1s - loss: 1.1983 - accuracy: 0.5610 - val_loss: 1.0590 - val_accuracy: 0.6612 - 1s/epoch - 9ms/step
Epoch 22/30
121/121 - 1s - loss: 1.1609 - accuracy: 0.5768 - val_loss: 1.0251 - val_accuracy: 0.6682 - 1s/epoch - 9ms/step
Epoch 23/30
121/121 - 1s - loss: 1.1741 - accuracy: 0.5716 - val_loss: 1.0364 - val_accuracy: 0.6776 - 1s/epoch - 9ms/step
Epoch 24/30
121/121 - 1s - loss: 1.1370 - accuracy: 0.5870 - val_loss: 0.9634 - val_accuracy: 0.6986 - 1s/epoch - 9ms/step
Epoch 25/30
121/121 - 1s - loss: 1.1024 - accuracy: 0.5888 - val_loss: 0.9813 - val_accuracy: 0.7079 - 1s/epoch - 9ms/step
Epoch 26/30
121/121 - 1s - loss: 1.1045 - accuracy: 0.5903 - val_loss: 0.9827 - val_accuracy: 0.7056 - 1s/epoch - 9ms/step
Epoch 27/30
121/121 - 1s - loss: 1.0825 - accuracy: 0.5901 - val_loss: 0.9738 - val_accuracy: 0.7173 - 1s/epoch - 9ms/step
Epoch 28/30
121/121 - 1s - loss: 1.0598 - accuracy: 0.6044 - val_loss: 0.9969 - val_accuracy: 0.7150 - 1s/epoch - 9ms/step
Epoch 29/30
121/121 - 1s - loss: 1.0650 - accuracy: 0.5960 - val_loss: 0.9562 - val_accuracy: 0.7266 - 1s/epoch - 10ms/step
Epoch 30/30
121/121 - 1s - loss: 1.0308 - accuracy: 0.6148 - val_loss: 0.9692 - val_accuracy: 0.7220 - 1s/epoch - 10ms/step

Model Evaluation¶

In [54]:
plt.plot(history_1.history['accuracy'])
plt.plot(history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

Evaluating the model on test data¶

In [55]:
accuracy = model.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 0s - loss: 1.0321 - accuracy: 0.6926 - 332ms/epoch - 22ms/step

Generating the predictions using test data¶

In [56]:
# Here we get the output as probabilities for each category
y_pred=model.predict(X_test_normalized)
15/15 [==============================] - 0s 3ms/step
In [57]:
y_pred
Out[57]:
array([[6.3017310e-27, 1.4537698e-14, 3.1952004e-16, ..., 7.7504323e-05,
        6.6727895e-18, 1.2495816e-05],
       [6.9060018e-12, 6.7600171e-04, 9.0842653e-04, ..., 5.9727160e-04,
        9.7171086e-01, 2.5960522e-02],
       [3.2669734e-10, 9.3343559e-05, 2.7883332e-05, ..., 5.9517950e-04,
        9.8347986e-01, 1.5675658e-02],
       ...,
       [2.4058828e-01, 4.2956966e-10, 1.0346009e-04, ..., 7.8247098e-09,
        3.3028933e-09, 1.7616181e-02],
       [2.3985336e-10, 8.7407631e-07, 3.1729793e-05, ..., 4.3570727e-02,
        6.1883247e-07, 5.6416271e-03],
       [6.4866851e-13, 6.4040473e-06, 6.5012991e-06, ..., 5.2344567e-01,
        3.7282091e-04, 9.5255692e-03]], dtype=float32)

Plotting the Confusion Matrix¶

In [58]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Computing the confusion matrix with tf.math.confusion_matrix(), a predefined TensorFlow function
# (named cm to avoid shadowing sklearn's confusion_matrix import)
cm = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()

Observations

  • The model achieves an accuracy of around 70% on the test data, i.e., it correctly classifies roughly 7 out of 10 images.
  • The model seems to be overfitting the training data, which means it is not able to generalize well to unseen data.
  • The model performs well on some classes: classes 3 and 6 are classified well.
  • We can also observe that classes 0, 2, 4, and 9 are mostly misclassified.
  • Hyperparameter tuning, data augmentation, or other model architectures can be used to improve the model.
  • We will try data augmentation for the next model.
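One way to make the "which classes are misclassified" reading more precise is to compute per-class recall straight from the confusion matrix. A sketch on a made-up 3-class matrix (not this model's actual counts):

```python
import numpy as np

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
cm = np.array([[8, 1, 1],
               [2, 6, 2],
               [0, 0, 10]])

# Per-class recall = diagonal / row sums; low entries flag the classes the
# model mostly gets wrong (like classes 0, 2, 4, and 9 above).
recall = np.diag(cm) / cm.sum(axis=1)
print(recall)                    # [0.8 0.6 1. ]

# Overall accuracy = trace / total.
print(np.trace(cm) / cm.sum())   # 0.8
```

The same two lines applied to the matrix plotted above reproduce the per-class recall column of the classification report.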

Plotting Classification Report

In [59]:
# Printing the classification report
from sklearn.metrics import classification_report
cr = classification_report(y_test_arg, y_pred_arg)
print(cr)
              precision    recall  f1-score   support

           0       0.00      0.00      0.00        26
           1       0.86      0.82      0.84        39
           2       0.79      0.52      0.62        29
           3       0.71      0.87      0.78        61
           4       0.00      0.00      0.00        22
           5       0.67      0.83      0.74        48
           6       0.57      0.97      0.72        65
           7       0.71      0.45      0.56        22
           8       0.63      0.71      0.67        52
           9       0.86      0.52      0.65        23
          10       0.93      0.82      0.87        50
          11       0.68      0.68      0.68        38

    accuracy                           0.69       475
   macro avg       0.62      0.60      0.59       475
weighted avg       0.65      0.69      0.66       475

Model Performance Improvement¶

Reducing the Learning Rate:

Hint: Use the ReduceLROnPlateau() callback, which decreases the learning rate by some factor if the monitored metric does not improve for some time. Training may then start decreasing the loss at the smaller learning rate. If the loss still does not decrease, the reduction is applied again in an attempt to achieve a lower loss.

In [126]:
# Code to monitor val_accuracy
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
                                            patience=3,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.00001)

Data Augmentation¶

Remember, data augmentation should not be used in the validation/test data set.

In [ ]:
# Clearing backend
from tensorflow.keras import backend
backend.clear_session()

# Fixing the seed for random number generators
import random
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
In [ ]:
# Defining the augmentation generator with rotation_range set to 20
train_datagen = ImageDataGenerator(
                              rotation_range=20,
                              fill_mode='nearest'
                              )
In [ ]:
# Initializing a sequential model
model1 = Sequential()

# Adding the first conv layer with 64 filters and kernel size 3x3; padding 'same' keeps the output size the same as the input size
# input_shape denotes the dimensions of the input images
model1.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))

# Adding max pooling to reduce the size of output of first conv layer
model1.add(MaxPooling2D((2, 2), padding = 'same'))
model1.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model1.add(MaxPooling2D((2, 2), padding = 'same'))
model1.add(BatchNormalization())
# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model1.add(Flatten())

# Adding a fully connected dense layer with 16 neurons
model1.add(Dense(16, activation='relu'))
model1.add(Dropout(0.3))
# Adding the output layer with 12 neurons and activation functions as softmax since this is a multi-class classification problem
model1.add(Dense(12, activation='softmax'))

# Using ADAM Optimizer
opt=Adam()
# Compile model
model1.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Generating the summary of the model
model1.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 64)        1792      
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 64)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 32)        18464     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 16, 16, 32)        0         
 g2D)                                                            
                                                                 
 batch_normalization (Batch  (None, 16, 16, 32)        128       
 Normalization)                                                  
                                                                 
 flatten (Flatten)           (None, 8192)              0         
                                                                 
 dense (Dense)               (None, 16)                131088    
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_1 (Dense)             (None, 12)                204       
                                                                 
=================================================================
Total params: 151676 (592.48 KB)
Trainable params: 151612 (592.23 KB)
Non-trainable params: 64 (256.00 Byte)
_________________________________________________________________
In [ ]:
# Epochs
epochs = 30
# Batch size
batch_size = 64

history = model1.fit(train_datagen.flow(X_train_normalized,y_train_encoded,
                                       batch_size=batch_size,
                                      #  seed=42,
                                       shuffle=False),
                    epochs=epochs,
                    steps_per_epoch=X_train_normalized.shape[0] // batch_size,
                    validation_data=(X_val_normalized,y_val_encoded),
                    verbose=1)
Epoch 1/30
60/60 [==============================] - 6s 73ms/step - loss: 2.1391 - accuracy: 0.2437 - val_loss: 2.4193 - val_accuracy: 0.1729
Epoch 2/30
60/60 [==============================] - 6s 94ms/step - loss: 1.6844 - accuracy: 0.4068 - val_loss: 2.2743 - val_accuracy: 0.2687
Epoch 3/30
60/60 [==============================] - 7s 119ms/step - loss: 1.4572 - accuracy: 0.4914 - val_loss: 2.1665 - val_accuracy: 0.3995
Epoch 4/30
60/60 [==============================] - 6s 105ms/step - loss: 1.3813 - accuracy: 0.5170 - val_loss: 2.0749 - val_accuracy: 0.3692
Epoch 5/30
60/60 [==============================] - 4s 69ms/step - loss: 1.2408 - accuracy: 0.5654 - val_loss: 1.8004 - val_accuracy: 0.4393
Epoch 6/30
60/60 [==============================] - 5s 89ms/step - loss: 1.1574 - accuracy: 0.5903 - val_loss: 1.6260 - val_accuracy: 0.5561
Epoch 7/30
60/60 [==============================] - 4s 70ms/step - loss: 1.1518 - accuracy: 0.5987 - val_loss: 1.6562 - val_accuracy: 0.5117
Epoch 8/30
60/60 [==============================] - 6s 94ms/step - loss: 1.0460 - accuracy: 0.6305 - val_loss: 1.2586 - val_accuracy: 0.5794
Epoch 9/30
60/60 [==============================] - 4s 69ms/step - loss: 1.0314 - accuracy: 0.6315 - val_loss: 1.0449 - val_accuracy: 0.7033
Epoch 10/30
60/60 [==============================] - 5s 79ms/step - loss: 0.9738 - accuracy: 0.6505 - val_loss: 1.1829 - val_accuracy: 0.6051
Epoch 11/30
60/60 [==============================] - 9s 149ms/step - loss: 0.9529 - accuracy: 0.6614 - val_loss: 0.8352 - val_accuracy: 0.7570
Epoch 12/30
60/60 [==============================] - 5s 82ms/step - loss: 0.8908 - accuracy: 0.6746 - val_loss: 1.2102 - val_accuracy: 0.6145
Epoch 13/30
60/60 [==============================] - 5s 83ms/step - loss: 0.8936 - accuracy: 0.6823 - val_loss: 0.8875 - val_accuracy: 0.7243
Epoch 14/30
60/60 [==============================] - 4s 71ms/step - loss: 0.8442 - accuracy: 0.6942 - val_loss: 0.7931 - val_accuracy: 0.7710
Epoch 15/30
60/60 [==============================] - 6s 95ms/step - loss: 0.8217 - accuracy: 0.7047 - val_loss: 1.6192 - val_accuracy: 0.5654
Epoch 16/30
60/60 [==============================] - 4s 72ms/step - loss: 0.7934 - accuracy: 0.7092 - val_loss: 1.4385 - val_accuracy: 0.5818
Epoch 17/30
60/60 [==============================] - 4s 70ms/step - loss: 0.7830 - accuracy: 0.7132 - val_loss: 0.8582 - val_accuracy: 0.7453
Epoch 18/30
60/60 [==============================] - 8s 140ms/step - loss: 0.7780 - accuracy: 0.7203 - val_loss: 0.8741 - val_accuracy: 0.7430
Epoch 19/30
60/60 [==============================] - 10s 171ms/step - loss: 0.7511 - accuracy: 0.7264 - val_loss: 0.7935 - val_accuracy: 0.7570
Epoch 20/30
60/60 [==============================] - 4s 71ms/step - loss: 0.7208 - accuracy: 0.7380 - val_loss: 0.9378 - val_accuracy: 0.7173
Epoch 21/30
60/60 [==============================] - 6s 96ms/step - loss: 0.7362 - accuracy: 0.7256 - val_loss: 1.2306 - val_accuracy: 0.6916
Epoch 22/30
60/60 [==============================] - 4s 71ms/step - loss: 0.7807 - accuracy: 0.7111 - val_loss: 0.8008 - val_accuracy: 0.7664
Epoch 23/30
60/60 [==============================] - 5s 82ms/step - loss: 0.7025 - accuracy: 0.7457 - val_loss: 0.9828 - val_accuracy: 0.7360
Epoch 24/30
60/60 [==============================] - 4s 70ms/step - loss: 0.6858 - accuracy: 0.7518 - val_loss: 1.1810 - val_accuracy: 0.6449
Epoch 25/30
60/60 [==============================] - 6s 95ms/step - loss: 0.6723 - accuracy: 0.7542 - val_loss: 0.7871 - val_accuracy: 0.7850
Epoch 26/30
60/60 [==============================] - 4s 71ms/step - loss: 0.6710 - accuracy: 0.7526 - val_loss: 1.5106 - val_accuracy: 0.6098
Epoch 27/30
60/60 [==============================] - 6s 96ms/step - loss: 0.6902 - accuracy: 0.7468 - val_loss: 2.1729 - val_accuracy: 0.5701
Epoch 28/30
60/60 [==============================] - 5s 77ms/step - loss: 0.6740 - accuracy: 0.7523 - val_loss: 0.8101 - val_accuracy: 0.7640
Epoch 29/30
60/60 [==============================] - 5s 87ms/step - loss: 0.6675 - accuracy: 0.7515 - val_loss: 1.1084 - val_accuracy: 0.7243
Epoch 30/30
60/60 [==============================] - 4s 69ms/step - loss: 0.6539 - accuracy: 0.7600 - val_loss: 0.8103 - val_accuracy: 0.7757
In [ ]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [ ]:
accuracy = model1.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 0s - loss: 0.8511 - accuracy: 0.7411 - 72ms/epoch - 5ms/step

Plotting the Confusion Matrix

In [ ]:
# Here we get the output as probabilities for each category
y_pred=model1.predict(X_test_normalized)
15/15 [==============================] - 0s 2ms/step
In [ ]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Plotting the confusion matrix using the confusion_matrix() function predefined in TensorFlow's tf.math module
confusion_matrix = tf.math.confusion_matrix(y_test_arg,y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
In [ ]:
# Printing the classification report
from sklearn.metrics import classification_report
cr = classification_report(y_test_arg, y_pred_arg)
print(cr)
              precision    recall  f1-score   support

           0       0.22      0.08      0.11        26
           1       0.87      0.87      0.87        39
           2       0.84      0.55      0.67        29
           3       0.90      0.90      0.90        61
           4       0.56      0.41      0.47        22
           5       0.88      0.62      0.73        48
           6       0.58      0.92      0.71        65
           7       0.76      0.86      0.81        22
           8       0.69      0.87      0.77        52
           9       0.56      0.39      0.46        23
          10       0.78      0.90      0.83        50
          11       0.93      0.74      0.82        38

    accuracy                           0.74       475
   macro avg       0.72      0.68      0.68       475
weighted avg       0.74      0.74      0.73       475
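To make reports like the one above easier to act on, per-class recall can be read straight off a confusion matrix (diagonal over row sums). A small self-contained sketch, not from the original notebook, using a toy 3-class matrix:

```python
import numpy as np

def per_class_recall(cm):
    """Recall for each class: diagonal entries divided by row sums."""
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1)
    # Guard against division by zero for classes with no true samples
    return np.divide(np.diag(cm), row_sums,
                     out=np.zeros(len(cm)), where=row_sums > 0)

# Toy 3-class confusion matrix (rows = true class, columns = predicted class)
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 0, 10]]
print(per_class_recall(cm))  # [0.8 0.6 1. ]
```

Running the same function on the notebook's `tf.math.confusion_matrix` output (converted with `.numpy()`) would highlight the weak classes, such as 0, 2, 4, and 9 noted below.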

Observation¶

  • The model achieved an improved accuracy of around 74% on the test data, compared to about 70% for the previous model.
  • The model still seems to be overfitting the training data, as indicated by the gap between the training and validation accuracy curves.
  • The model's performance has improved on some classes, such as classes 6 and 3, which are now better classified.
  • However, the model is still struggling with some classes, such as classes 0, 2, 4, and 9, which are still being misclassified.
  • Further improvements can be made to the model by exploring other techniques such as hyperparameter tuning, regularization, and different model architectures.
  • Data augmentation has been shown to be effective in improving the model's performance, and further exploration of different augmentation techniques could potentially lead to further improvements.
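As a hedged illustration of "different augmentation techniques", the settings below show a few `ImageDataGenerator` options one could try; the particular values are assumptions for illustration, not tuned results from this notebook.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=30,       # random rotations up to 30 degrees
    width_shift_range=0.1,   # small horizontal shifts
    height_shift_range=0.1,  # small vertical shifts
    zoom_range=0.2,          # random zoom in/out
    horizontal_flip=True,    # seedlings have no canonical left/right
    fill_mode='nearest')     # fill pixels exposed by the transforms

# Example: one augmented batch from dummy 64x64 RGB images
X_dummy = np.random.rand(8, 64, 64, 3).astype('float32')
batch = next(datagen.flow(X_dummy, batch_size=8, shuffle=False))
print(batch.shape)  # (8, 64, 64, 3)
```

In the notebook, `X_dummy` would be replaced by `X_train_normalized` and the generator passed to `model.fit` via `datagen.flow(X_train_normalized, y_train_encoded, batch_size=32)`.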
In [140]:
# Callback to reduce the learning rate when val_accuracy stops improving
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
                                            patience=3,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.00001)
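The `learning_rate_reduction` callback above only takes effect when passed to `fit()` via the `callbacks` argument. A minimal self-contained sketch of the wiring, using toy data rather than the notebook's model:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import ReduceLROnPlateau

tf.random.set_seed(42)
np.random.seed(42)

# Toy stand-in data: 64 samples, 8 features, 3 classes
X = np.random.rand(64, 8).astype('float32')
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 64), 3)

toy = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(3, activation='softmax')])
toy.compile(optimizer='adam', loss='categorical_crossentropy',
            metrics=['accuracy'])

reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', patience=1,
                              factor=0.5, min_lr=1e-5)

# The callback is handed to fit() through the callbacks list
hist = toy.fit(X, y, validation_split=0.25, epochs=3, batch_size=16,
               verbose=0, callbacks=[reduce_lr])
print(len(hist.history['loss']))  # 3
```

In the notebook the same pattern would be `callbacks=[learning_rate_reduction]` on the real `model.fit(...)` call.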

Transfer Learning (VGG16) with Data Augmentation¶

In [ ]:
from tensorflow.keras.applications import VGG16
# Loading VGG16 model
model2 = VGG16(weights='imagenet')
# Summary of the whole model
model2.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
553467096/553467096 [==============================] - 3s 0us/step
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
 flatten (Flatten)           (None, 25088)             0         
                                                                 
 fc1 (Dense)                 (None, 4096)              102764544 
                                                                 
 fc2 (Dense)                 (None, 4096)              16781312  
                                                                 
 predictions (Dense)         (None, 1000)              4097000   
                                                                 
=================================================================
Total params: 138357544 (527.79 MB)
Trainable params: 138357544 (527.79 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
from tensorflow.keras.models import Model
# Getting only the conv layers for transfer learning.
transfer_layer = model2.get_layer('block5_pool')
vgg_model = Model(inputs=model2.input, outputs=transfer_layer.output)
In [ ]:
vgg_model.summary()
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
=================================================================
Total params: 14714688 (56.13 MB)
Trainable params: 14714688 (56.13 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
vgg_model = VGG16(weights='imagenet', include_top = False, input_shape = (64, 64, 3))
vgg_model.summary()
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 64, 64, 3)]       0         
                                                                 
 block1_conv1 (Conv2D)       (None, 64, 64, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 64, 64, 64)        36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 32, 32, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 32, 32, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 32, 32, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 16, 16, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 16, 16, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 16, 16, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 16, 16, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 8, 8, 256)         0         
                                                                 
 block4_conv1 (Conv2D)       (None, 8, 8, 512)         1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 8, 8, 512)         2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 8, 8, 512)         2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 4, 4, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 4, 4, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 2, 2, 512)         0         
                                                                 
=================================================================
Total params: 14714688 (56.13 MB)
Trainable params: 14714688 (56.13 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
# Making all the layers of the VGG model non-trainable. i.e. freezing them
for layer in vgg_model.layers:
    layer.trainable = False
In [ ]:
for layer in vgg_model.layers:
    print(layer.name, layer.trainable)
input_1 False
block1_conv1 False
block1_conv2 False
block1_pool False
block2_conv1 False
block2_conv2 False
block2_pool False
block3_conv1 False
block3_conv2 False
block3_conv3 False
block3_pool False
block4_conv1 False
block4_conv2 False
block4_conv3 False
block4_pool False
block5_conv1 False
block5_conv2 False
block5_conv3 False
block5_pool False
In [ ]:
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure the same output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
In [ ]:
# Initializing the model
new_model = Sequential()

# Adding the convolutional part of the VGG16 model from above
new_model.add(vgg_model)

# Flattening the output of the VGG16 model because it is from a convolutional layer
new_model.add(Flatten())

# Adding a dense hidden layer
new_model.add(Dense(32, activation='relu'))
# Adding dropout
new_model.add(Dropout(0.2))
# Adding a second hidden layer
new_model.add(Dense(32, activation='relu'))
# Adding output layer
new_model.add(Dense(12, activation='softmax'))
In [ ]:
# Compiling the model
new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Summary of the model
new_model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 vgg16 (Functional)          (None, 2, 2, 512)         14714688  
                                                                 
 flatten (Flatten)           (None, 2048)              0         
                                                                 
 dense (Dense)               (None, 32)                65568     
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense_1 (Dense)             (None, 32)                1056      
                                                                 
 dense_2 (Dense)             (None, 12)                396       
                                                                 
=================================================================
Total params: 14781708 (56.39 MB)
Trainable params: 67020 (261.80 KB)
Non-trainable params: 14714688 (56.13 MB)
_________________________________________________________________
In [ ]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
In [ ]:
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=5)
mc = ModelCheckpoint('best_model.h5', monitor='val_accuracy', mode='max', verbose=1, save_best_only=True)

## Fitting the transfer learning model

new_model_history = new_model.fit(
            X_train_normalized, y_train_encoded,
            epochs=30,
            validation_data=(X_val_normalized,y_val_encoded),
            batch_size=32,
            verbose=2)
Epoch 1/30
121/121 - 6s - loss: 2.2561 - accuracy: 0.2246 - val_loss: 2.0508 - val_accuracy: 0.2921 - 6s/epoch - 49ms/step
Epoch 2/30
121/121 - 2s - loss: 1.9005 - accuracy: 0.3491 - val_loss: 1.7711 - val_accuracy: 0.4065 - 2s/epoch - 15ms/step
Epoch 3/30
121/121 - 2s - loss: 1.6848 - accuracy: 0.4227 - val_loss: 1.6406 - val_accuracy: 0.4533 - 2s/epoch - 16ms/step
Epoch 4/30
121/121 - 2s - loss: 1.5423 - accuracy: 0.4627 - val_loss: 1.5226 - val_accuracy: 0.4556 - 2s/epoch - 17ms/step
Epoch 5/30
121/121 - 2s - loss: 1.4385 - accuracy: 0.4936 - val_loss: 1.4619 - val_accuracy: 0.4720 - 2s/epoch - 16ms/step
Epoch 6/30
121/121 - 2s - loss: 1.3654 - accuracy: 0.5126 - val_loss: 1.3796 - val_accuracy: 0.5140 - 2s/epoch - 16ms/step
Epoch 7/30
121/121 - 2s - loss: 1.2976 - accuracy: 0.5370 - val_loss: 1.3342 - val_accuracy: 0.5561 - 2s/epoch - 17ms/step
Epoch 8/30
121/121 - 2s - loss: 1.2458 - accuracy: 0.5524 - val_loss: 1.3341 - val_accuracy: 0.5070 - 2s/epoch - 16ms/step
Epoch 9/30
121/121 - 2s - loss: 1.1887 - accuracy: 0.5729 - val_loss: 1.3390 - val_accuracy: 0.5304 - 2s/epoch - 16ms/step
Epoch 10/30
121/121 - 2s - loss: 1.1631 - accuracy: 0.5851 - val_loss: 1.2765 - val_accuracy: 0.5537 - 2s/epoch - 16ms/step
Epoch 11/30
121/121 - 2s - loss: 1.1150 - accuracy: 0.6064 - val_loss: 1.2737 - val_accuracy: 0.5444 - 2s/epoch - 17ms/step
Epoch 12/30
121/121 - 2s - loss: 1.1060 - accuracy: 0.5992 - val_loss: 1.1904 - val_accuracy: 0.5888 - 2s/epoch - 16ms/step
Epoch 13/30
121/121 - 2s - loss: 1.0510 - accuracy: 0.6174 - val_loss: 1.2215 - val_accuracy: 0.5491 - 2s/epoch - 17ms/step
Epoch 14/30
121/121 - 2s - loss: 1.0268 - accuracy: 0.6327 - val_loss: 1.2078 - val_accuracy: 0.5771 - 2s/epoch - 17ms/step
Epoch 15/30
121/121 - 2s - loss: 1.0026 - accuracy: 0.6382 - val_loss: 1.1842 - val_accuracy: 0.5841 - 2s/epoch - 17ms/step
Epoch 16/30
121/121 - 2s - loss: 0.9705 - accuracy: 0.6488 - val_loss: 1.2302 - val_accuracy: 0.5748 - 2s/epoch - 16ms/step
Epoch 17/30
121/121 - 2s - loss: 0.9402 - accuracy: 0.6642 - val_loss: 1.2064 - val_accuracy: 0.5771 - 2s/epoch - 16ms/step
Epoch 18/30
121/121 - 2s - loss: 0.9425 - accuracy: 0.6561 - val_loss: 1.1734 - val_accuracy: 0.5981 - 2s/epoch - 17ms/step
Epoch 19/30
121/121 - 2s - loss: 0.8971 - accuracy: 0.6790 - val_loss: 1.2040 - val_accuracy: 0.5794 - 2s/epoch - 16ms/step
Epoch 20/30
121/121 - 2s - loss: 0.8738 - accuracy: 0.6875 - val_loss: 1.2519 - val_accuracy: 0.5724 - 2s/epoch - 16ms/step
Epoch 21/30
121/121 - 2s - loss: 0.8661 - accuracy: 0.6888 - val_loss: 1.1600 - val_accuracy: 0.5818 - 2s/epoch - 16ms/step
Epoch 22/30
121/121 - 2s - loss: 0.8450 - accuracy: 0.7039 - val_loss: 1.1815 - val_accuracy: 0.6028 - 2s/epoch - 17ms/step
Epoch 23/30
121/121 - 2s - loss: 0.8287 - accuracy: 0.6990 - val_loss: 1.1337 - val_accuracy: 0.5981 - 2s/epoch - 16ms/step
Epoch 24/30
121/121 - 2s - loss: 0.8281 - accuracy: 0.7050 - val_loss: 1.1452 - val_accuracy: 0.6098 - 2s/epoch - 17ms/step
Epoch 25/30
121/121 - 2s - loss: 0.8054 - accuracy: 0.7057 - val_loss: 1.1620 - val_accuracy: 0.5888 - 2s/epoch - 17ms/step
Epoch 26/30
121/121 - 2s - loss: 0.7709 - accuracy: 0.7195 - val_loss: 1.1714 - val_accuracy: 0.6098 - 2s/epoch - 17ms/step
Epoch 27/30
121/121 - 2s - loss: 0.7680 - accuracy: 0.7154 - val_loss: 1.2238 - val_accuracy: 0.5981 - 2s/epoch - 17ms/step
Epoch 28/30
121/121 - 2s - loss: 0.7555 - accuracy: 0.7333 - val_loss: 1.2148 - val_accuracy: 0.6051 - 2s/epoch - 16ms/step
Epoch 29/30
121/121 - 2s - loss: 0.7449 - accuracy: 0.7369 - val_loss: 1.1819 - val_accuracy: 0.6238 - 2s/epoch - 17ms/step
Epoch 30/30
121/121 - 2s - loss: 0.7267 - accuracy: 0.7388 - val_loss: 1.1264 - val_accuracy: 0.6262 - 2s/epoch - 17ms/step

Plotting Accuracy vs Epoch Curve

In [ ]:
plt.plot(new_model_history.history['accuracy'])
plt.plot(new_model_history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [ ]:
accuracy = new_model.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 0s - loss: 1.1387 - accuracy: 0.6358 - 333ms/epoch - 22ms/step
In [ ]:
normal_y_test = np.argmax(y_test_encoded, axis=1)
In [ ]:
# Here we get the output as probabilities for each category
y_test_pred_ln3=new_model.predict(X_test_normalized)
y_test_pred_classes_ln3 = np.argmax(y_test_pred_ln3, axis=1)
15/15 [==============================] - 1s 20ms/step
In [ ]:
import seaborn as sns
from sklearn.metrics import accuracy_score, confusion_matrix
accuracy_score(normal_y_test, y_test_pred_classes_ln3)
Out[ ]:
0.6357894736842106
In [ ]:
y_pred=new_model.predict(X_test_normalized)
15/15 [==============================] - 0s 19ms/step
In [ ]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Plotting the confusion matrix using the confusion_matrix() function predefined in TensorFlow's tf.math module
confusion_matrix = tf.math.confusion_matrix(y_test_arg,y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
In [ ]:
# Printing the classification report
from sklearn.metrics import classification_report
cr = classification_report(y_test_arg, y_pred_arg)
print(cr)
              precision    recall  f1-score   support

           0       0.50      0.27      0.35        26
           1       0.54      0.56      0.55        39
           2       0.61      0.69      0.65        29
           3       0.75      0.82      0.78        61
           4       0.41      0.32      0.36        22
           5       0.58      0.65      0.61        48
           6       0.67      0.88      0.76        65
           7       0.78      0.64      0.70        22
           8       0.59      0.63      0.61        52
           9       0.11      0.04      0.06        23
          10       0.76      0.74      0.75        50
          11       0.70      0.61      0.65        38

    accuracy                           0.64       475
   macro avg       0.58      0.57      0.57       475
weighted avg       0.62      0.64      0.62       475

Observation¶

  • The overall accuracy of the model is about 64%, meaning it correctly classifies roughly 64 out of every 100 test images.
  • The model is still struggling with some classes, such as classes 0, 2, 4, and 9, which are still being misclassified.
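One common next step when a frozen-backbone transfer model plateaus is to fine-tune the last convolutional block at a low learning rate. The sketch below was not part of this run and is purely illustrative; to stay self-contained it uses a tiny stand-in model with VGG-style layer names, but the same loop would run over `vgg_model.layers` in the notebook.

```python
import tensorflow as tf

# Tiny stand-in with VGG-style layer names (the real model is vgg_model)
stand_in = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, name='block4_conv1', input_shape=(16, 16, 3)),
    tf.keras.layers.Conv2D(8, 3, name='block5_conv1'),
    tf.keras.layers.Conv2D(8, 3, name='block5_conv2')])

# Unfreeze only the last block; everything else stays frozen
for layer in stand_in.layers:
    layer.trainable = layer.name.startswith('block5')

print([l.trainable for l in stand_in.layers])  # [False, True, True]
```

After unfreezing, the full model would be recompiled with a much smaller learning rate (e.g. `Adam(1e-5)`) so the pretrained features are only nudged rather than overwritten.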

Final Model¶

The final model selected is used in the code below to visualize predictions on test images.

  • We select the second model, the CNN trained with data augmentation (model1), as it achieved the best test accuracy (~74%).

Visualizing the prediction¶

In [125]:
# Visualizing the predicted and true labels of images from the test data
# using the final model (model1)
for idx in [2, 33, 59, 36]:
    plt.figure(figsize=(2, 2))
    plt.imshow(X_test[idx])
    plt.show()
    # Reshaping to a batch of one image before predicting
    pred = model1.predict(X_test_normalized[idx].reshape(1, 64, 64, 3))
    # Using inverse_transform() to map the output vector back to the label
    print('Predicted Label', enc.inverse_transform(pred))
    print('True Label', enc.inverse_transform(y_test_encoded)[idx])
1/1 [==============================] - 0s 31ms/step
Predicted Label ['Cleavers']
True Label Common Chickweed
1/1 [==============================] - 0s 46ms/step
Predicted Label ['Cleavers']
True Label Cleavers
1/1 [==============================] - 0s 73ms/step
Predicted Label ['Cleavers']
True Label Common Chickweed
1/1 [==============================] - 0s 59ms/step
Predicted Label ['Charlock']
True Label Shepherds Purse

Actionable Insights and Business Recommendations¶

Key takeaways for the business:

  • The model achieved an accuracy of about 74% on the test data. This means it correctly classifies roughly 74 out of every 100 images.
  • The model is still overfitting the training data. This means that the model is not able to generalize to unseen data.
  • Further improvements can be made to the model by exploring other techniques such as hyperparameter tuning, regularization, and different model architectures.
  • Data augmentation has been shown to be effective in improving the model's performance. Further exploration of different augmentation techniques could potentially lead to further improvements.
  • The model can be used to automate the process of plant seedling identification. This could save time and money for farmers and other agricultural businesses.
  • The model could also be used to develop new plant identification tools. These tools could help farmers and other agricultural businesses sort and recognize seedlings and weeds more quickly and effectively.

Conclusion:

The model developed in this project has the potential to be a valuable tool for farmers and other agricultural businesses. It can be used to automate the process of plant seedling identification, which could save time and money, and could also underpin new plant identification tools that help farmers and other agricultural businesses sort and recognize seedlings and weeds more quickly and effectively.

Further work:

  • Further improvements can be made to the model by exploring other techniques such as hyperparameter tuning, regularization, and different model architectures.
  • The model could be tested on a larger dataset to assess its generalizability.
  • The model could be integrated into a mobile app or web service to make it more accessible to farmers and other agricultural businesses.
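The last bullet can be sketched with Keras' standard save/load API: the trained model is exported as a single file that an app or web service can reload without any of the training code. The toy model and file name below are illustrative stand-ins; in the notebook the call would be `model1.save(...)`.

```python
import os
import tempfile
import numpy as np
import tensorflow as tf

# Toy stand-in for the trained classifier
toy = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(2,))])
toy.compile(optimizer='adam', loss='mse')

# Save the whole model (architecture + weights) to a single HDF5 file
path = os.path.join(tempfile.mkdtemp(), 'seedling_classifier.h5')
toy.save(path)

# A serving process would reload it like this
restored = tf.keras.models.load_model(path)

# The restored model produces identical predictions
x = np.random.rand(1, 2).astype('float32')
print(np.allclose(toy.predict(x, verbose=0),
                  restored.predict(x, verbose=0)))  # True
```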